Combining Markov Decision Processes with Linear Optimal Controllers
Authors
Abstract
Linear Quadratic Gaussian (LQG) control has a known analytical solution [1], but nonlinear problems do not [2]. The state-of-the-art method for finding approximate solutions to nonlinear control problems, iterative LQG [3], carries a large computational cost associated with its iterative calculations [4]. We propose a novel approach to solving nonlinear Optimal Control (OC) problems that combines Reinforcement Learning (RL) with OC. The new algorithm, RLOC, uses a small set of localized optimal linear controllers and applies a Monte Carlo algorithm that learns the mapping from the state space to controllers. We illustrate our approach by solving a nonlinear OC problem for a two-joint arm, modeled as two point masses operating in a plane. We show that controlling the arm with RLOC is less costly than using the Linear Quadratic Regulator (LQR). This finding shows that nonlinear optimal control problems can be solved using a novel approach of adaptive RL.
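The core idea of the abstract, keeping a small library of local LQR controllers and using Monte Carlo RL to learn which controller to apply in which region of state space, can be sketched as follows. This is a minimal illustration on a hypothetical one-dimensional double-integrator plant, not the paper's two-joint arm; the dynamics, the state binning, and the two-controller library are assumptions made for the sake of a runnable example.

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=200):
    """Iterate the discrete-time Riccati recursion to a fixed point and
    return the infinite-horizon LQR feedback gain K (control law u = -K x)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Toy double-integrator dynamics (a stand-in for a linearized plant).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)
R = np.array([[0.1]])

# A small library of local linear controllers. Here they differ only in
# the effort penalty; in RLOC they would come from linearizations of the
# nonlinear plant around different operating points.
controllers = [lqr_gain(A, B, Q, np.array([[r]])) for r in (0.1, 1.0)]

def rollout(x0, choice, steps=50):
    """Run one episode under the chosen controller; return quadratic cost."""
    x, cost = x0.copy(), 0.0
    K = controllers[choice]
    for _ in range(steps):
        u = -K @ x
        cost += float(x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u
    return cost

# Monte Carlo learning of the mapping: state region -> controller index.
rng = np.random.default_rng(0)
n_bins, eps = 4, 0.2
Qval = np.zeros((n_bins, len(controllers)))   # estimated episode cost
count = np.zeros_like(Qval)                   # visit counts

def bin_of(x):
    """Coarse discretization of the position coordinate into n_bins cells."""
    return int(np.clip((x[0] + 2.0) / 4.0 * n_bins, 0, n_bins - 1))

for ep in range(500):
    x0 = rng.uniform(-2, 2, size=2)
    s = bin_of(x0)
    # Epsilon-greedy over controllers; greedy = lowest estimated cost.
    if rng.random() < eps:
        a = int(rng.integers(len(controllers)))
    else:
        a = int(np.argmin(Qval[s]))
    ret = rollout(x0, a)
    count[s, a] += 1
    Qval[s, a] += (ret - Qval[s, a]) / count[s, a]   # incremental mean

policy = np.argmin(Qval, axis=1)   # learned controller choice per state bin
```

The sketch separates the two ingredients named in the abstract: the analytical piece (each `lqr_gain` call solves a local linear-quadratic problem exactly) and the learned piece (tabular Monte Carlo estimates of which local controller is cheapest from each region), avoiding any iterative re-linearization of the kind iLQG performs.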
Similar references
Finding Optimal POMDP Controllers Using Quadratically Constrained Linear Programs
Developing scalable algorithms for solving partially observable Markov decision processes (POMDPs) is an important challenge. One promising approach is based on representing POMDP policies as finite-state controllers. This method has been used successfully to address the intractable memory requirements of POMDP algorithms. We illustrate some fundamental theoretical limitations of existing techn...
Optimal Process Control of Symbolic Transfer Functions
Transfer function modeling is a standard technique in classical Linear Time Invariant and Statistical Process Control. The work of Box and Jenkins was seminal in developing methods for identifying parameters associated with classical (r, s, k) transfer functions. Computing systems are often fundamentally discrete and feedback control in these situations may require discrete event systems for mo...
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent Reinforcement Learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
Existence of Optimal Policies for Semi-Markov Decision Processes Using Duality for Infinite Linear Programming
Semi-Markov decision processes on Borel spaces with deterministic kernels have many practical applications, particularly in inventory theory. Most of the results from general semi-Markov decision processes do not carry over to a deterministic kernel since such a kernel does not provide “smoothness.” We develop infinite dimensional linear programming theory for a general stochastic semi-Markov d...
Synthesizing Efficient Controllers
In many situations, we are interested in controllers that implement a good trade-off between conflicting objectives, e.g., the speed of a car versus its fuel consumption, or the transmission rate of a wireless device versus its energy consumption. In both cases, we aim for a system that efficiently uses its resources. In this paper we show how to automatically construct efficient controllers. W...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2011